
Stochastic Coalitional Better-response Dynamics and Strong Nash Equilibrium

Konstantin Avrachenkov, Vikas Vikram Singh 


RESEARCH 
REPORT 
N° 8716 

April 2015 

Project-Teams Maestro 


ISSN 0249-6399 ISRN INRIA/RR-8716-FR+ENG







Abstract: We consider coalition formation among players in an n-player finite strategic game
over an infinite horizon. At each time a randomly formed coalition makes a joint deviation from the
current action profile such that, at the new action profile, all players in the coalition are strictly
better off. Such deviations define a coalitional better-response (CBR) dynamics that is in general
stochastic. The CBR dynamics either converges to a strong Nash equilibrium or gets stuck in a closed
cycle. We also assume that at each time the selected coalition makes a mistake in its deviation with
small probability, which adds mutations (perturbations) to the CBR dynamics. We prove that all strong
Nash equilibria and closed cycles are stochastically stable, i.e., they are selected by the perturbed
CBR dynamics as mutations vanish. A similar statement holds for strict strong Nash equilibria.
We apply the CBR dynamics to network formation games and prove that all strongly stable
networks and closed cycles are stochastically stable.

Key-words: Strong Nash equilibrium, coalitional better-response, stochastic stability, network
formation games, strongly stable networks.
Coalitional better-response dynamics and strong Nash equilibrium

Summary: We consider a process of coalition formation among the players of a finite strategic game
over an infinite time horizon. At each step, a randomly formed coalition makes a joint deviation from
the current action profile such that, at the new action profile, all players of the coalition are
strictly better off. Such deviations define a coalitional better-response (CBR) dynamics, which is in
general stochastic. The CBR dynamics either converges to a strong Nash equilibrium or ends in a closed
cycle. In addition, we assume that at each step the selected coalition makes a mistake with small
probability, which adds mutations (perturbations) to the CBR dynamics. We prove that all strong Nash
equilibria and closed cycles are stochastically stable, that is, they are selected by the perturbed CBR
dynamics as the mutations vanish. A similar statement holds for strict strong Nash equilibria. We apply
the CBR dynamics to network formation games and prove that all strongly stable networks and closed
cycles are stochastically stable.

Key-words: Strong Nash equilibrium, coalitional better-response, stochastic stability, network
formation games, strongly stable networks.
1 Introduction

Nash equilibrium is the most desirable solution concept in non-cooperative game theory. When a strategic
game is played repeatedly over an infinite horizon, the Nash equilibrium that is played in the long run
depends on the initial action profile as well as on the way the players choose their actions at each time.
Several discrete-time dynamics have been studied in the literature to address Nash equilibrium selection
in the long run. Young [15] considered an n-player strategic game where at each time all the players
move simultaneously and each player chooses an action that is a best response to k previous
games among the m, k < m, most recent games. In general this dynamics need not converge
to a Nash equilibrium; it may get stuck in a closed cycle. Young [15] also considered the case where at
each time, with small probability, each player makes a mistake and chooses some non-optimal action. These
mistakes add mutations to the dynamics. The mutation probability can be made arbitrarily small, which
leads to a notion of stability of a Nash equilibrium as mutations vanish. This type of stability is known as
stochastic stability. Young proposed an algorithm to compute the stochastically stable Nash equilibria.
For 2 x 2 coordination games he showed that the risk dominant Nash equilibrium is stochastically stable.
Kandori et al. considered a different dynamic model where at each time each player plays against
every other player in a pairwise contest. The pairwise contest is given by a 2 x 2 symmetric matrix game
and each player chooses an action which has a higher expected average payoff. Mutations enter the
dynamics through wrong actions taken by the players. For 2 x 2 coordination games they showed
that a risk dominant Nash equilibrium is stochastically stable. That is, for 2 x 2 coordination games the
dynamics of Young [15] and of Kandori et al. select the same Nash equilibrium. Fudenberg
et al. [7] proposed a dynamics where at each time only one player is selected to choose an action.
Mutations with small probability also occur at each time. The risk dominant Nash equilibrium in 2 x 2
coordination games need not be stochastically stable under this dynamics.

The Nash equilibrium concept is inadequate for situations where players can communicate a priori
and are thus in a position to form a coalition and jointly deviate in a coordinated way. To capture such
situations, the strong Nash equilibrium (SNE) introduced by Aumann [1] is an adequate solution concept.
From an SNE no coalition can deviate to a new action profile at which the actions of all players outside
the coalition are the same as at the SNE and all the players in the coalition are strictly better off.
There is another equilibrium notion that is stronger than the SNE, called strict strong Nash equilibrium
(SSNE). From an SSNE no coalition can deviate to a new action profile at which the actions of all players
outside the coalition are the same as at the SSNE, all the players in the coalition get at least as much as
at the SSNE, and at least one of them is strictly better off. It is clear that an SSNE is always an SNE.
Motivated by the application of SNE to network formation games by Dutta and Mutuswami [3] and of SSNE to
network formation games by Jackson and van den Nouweland and by Jackson, we restrict ourselves to
pure actions. A network that is stable against the deviations of all coalitions is called a strongly
stable network; under a top convexity condition on the payoff functions it indeed exists, as shown by
Jackson and van den Nouweland. An SNE need not always exist, and in that case there exists a set of
action profiles forming a closed cycle: it is possible to reach any action profile in the set from any
other via a sequence of improving deviations by coalitions, and it is not possible to reach an action
profile outside the closed cycle from an action profile in the closed cycle via improving deviations by
coalitions.

There are many dynamics for equilibrium selection in the literature, as discussed above, describing
various situations of dynamic play. To the best of our knowledge, no dynamics has been proposed so far
that captures the situation where at each time players are allowed to form a coalition and make a move
in a coordinated way. In this paper we propose a CBR dynamics where at each time players are allowed
to form a coalition and make a joint deviation from the current action profile if it is strictly beneficial
for all the members of the coalition. We assume that the coalition formation is random and that at each
time only one coalition can be formed. We also consider the situation where at each time the formed
coalition makes a wrong decision with small probability, i.e., it moves to an action profile where
not all the players in the coalition are strictly better off. These mistakes work as mutations and add
perturbations to the CBR dynamics. We prove that the perturbed CBR dynamics selects all strong Nash
equilibria and closed cycles in the long run as mutations vanish, i.e., all strong Nash equilibria and closed
cycles are stochastically stable. For 2 x 2 symmetric coordination games this dynamics always selects a
payoff dominant Nash equilibrium instead of a risk dominant Nash equilibrium, because a payoff dominant
Nash equilibrium is an SNE. A similar CBR dynamics can be given for the case where at each time a
coalition deviates from the current action profile such that all players in the coalition are at least as
well off at the new action profile and at least one player is strictly better off. Under such dynamics all
strict strong Nash equilibria and closed cycles are stochastically stable.

We apply the CBR dynamics corresponding to SSNE to network formation games, where the nodes (players)
of a network form a coalition and move to a new network if it offers each member at least as much
as the current network and at least one member gets a strictly better payoff. Mutations are
present due to the wrong decisions taken by the coalitions. We prove that all strongly stable networks
and closed cycles are stochastically stable.

The paper is organized as follows. Section 2 contains the model and a few definitions. We describe the
CBR dynamics in Section 3. Section 4 contains the application of the CBR dynamics to network formation
games. We conclude the paper in Section 5. As a by-product, we give an algorithm to compute an SNE
in Appendix A.


2 The Model 

We consider an n-player strategic game whose components are defined as follows: 

1. $N = \{1, 2, \dots, n\}$ is a finite set of players.

2. $A_i$ is a finite set of actions of player $i$ and its element is denoted by $a_i$. We denote by
$A = \prod_{i=1}^{n} A_i$ the set of all action profiles, and $a = (a_1, a_2, \dots, a_n)$ denotes an
element of $A$. Let $\mathcal{S}$ be the set of all coalitions among players. For a coalition
$S \in \mathcal{S}$, define $A_S = \prod_{i \in S} A_i$, whose element is denoted by $a_S$, and let
$a_{-S}$ denote an action profile of the players outside $S$.

3. $u_i : A \to \mathbb{R}$ is a payoff function of player $i$. Specifically, player $i$ receives payoff
$u_i(a_1, a_2, \dots, a_n)$ when each player $j$, $j = 1, 2, \dots, n$, chooses action $a_j$.

In non-cooperative games, the Nash equilibrium is stable against unilateral deviations, i.e., no player
has an incentive to deviate unilaterally from it. However, the Nash equilibrium fails to capture the
situation where the players can communicate with each other a priori. In such cases some of the players
can form a coalition and jointly deviate from the current action profile if at the new action profile each
player in the coalition is strictly better off. In some cases players also make a joint deviation from the
current action profile if at the new action profile all the players of the coalition are at least as well off
and at least one player is strictly better off. Such deviations lead to the definitions of strong Nash
equilibrium [1] and strict strong Nash equilibrium, which we define next. Motivated by the application of
SNE to network formation games by Dutta and Mutuswami [3] and the application of SSNE to network
formation games by Jackson and van den Nouweland and by Jackson, we restrict ourselves to pure actions.

Definition 2.1 (Strong Nash Equilibrium). An action profile $a^*$ is said to be a strong Nash equilibrium
if there is no $S \in \mathcal{S}$ and $a \in A$ such that

1. $a_i = a_i^*$, $\forall\, i \notin S$.

2. $u_i(a) > u_i(a^*)$, $\forall\, i \in S$.

Let $A(S, a)$ be the set of all action profiles reachable from $a$ via a deviation of coalition $S$. It is
defined as

\[ A(S, a) = \{ a' \mid a'_i = a_i, \ \forall\, i \notin S \ \text{and} \ a'_i \in A_i, \ \forall\, i \in S \}. \]

A coalition always has the option to do nothing, so $a \in A(S, a)$. Let $\mathcal{I}_1(S, a)$ be the set of
improved action profiles reachable from an action profile $a$ via a deviation of coalition $S$, i.e.,

\[ \mathcal{I}_1(S, a) = \{ a' \mid a'_i = a_i, \ \forall\, i \notin S \ \text{and} \ u_i(a') > u_i(a), \ \forall\, i \in S \}. \tag{1} \]
For an improved action profile $a' \in \mathcal{I}_1(S, a)$, the action profile $a'_S$ of the players in $S$
is called a better-response of coalition $S$ against the fixed action profile $a_{-S}$ of the players
outside $S$. Define $\bar{\mathcal{I}}_1(S, a) = A(S, a) \setminus \mathcal{I}_1(S, a)$ as the set of all
action profiles that result from erroneous decisions of coalition $S$. It is clear that
$a \notin \mathcal{I}_1(S, a)$, so $a \in \bar{\mathcal{I}}_1(S, a)$. That is, $\bar{\mathcal{I}}_1(S, a)$ is
always nonempty for all $S$ and $a$. An SNE need not always exist. In such a case there exists a set of
action profiles lying on a closed cycle, and all such action profiles can be reached from each other via an
improving path. The definitions of closed cycle and improving path are as follows:

Definition 2.2 (Improving Path). An improving path from $a$ to $a'$ is a sequence of action profiles
and coalitions $a^1, S_1, a^2, \dots, S_{m-1}, a^m$ such that $a^1 = a$, $a^m = a'$ and
$a^{k+1} \in \mathcal{I}_1(S_k, a^k)$ for all $k = 1, 2, \dots, m-1$.

Definition 2.3 (Cycles). A set of action profiles $C$ forms a cycle if for any $a \in C$ and $a' \in C$ there
exists an improving path connecting $a$ and $a'$. A cycle is said to be a closed cycle if no action profile
in $C$ lies on an improving path leading to an action profile that is not in $C$.

Theorem 2.4. There always exists a strong Nash equilibrium or a closed cycle of action profiles.

Proof. An action profile is an SNE if and only if no coalition can make an improving deviation from it
to another action profile. So, start at an action profile. Either it is an SNE or there exists
a coalition that can make an improving deviation to another action profile. In the first case the result is
established. In the second case the same dichotomy holds at the new action profile, i.e., either it is an
SNE or there exists a coalition that can make an improving deviation to another action profile. Since the
number of action profiles is finite, this process either finds an action profile which is an SNE or returns
to an action profile visited earlier, i.e., there exists a cycle. Thus, there always exists
either an SNE or a cycle. Suppose there are no strong Nash equilibria. Given the finite number of action
profiles and the non-existence of strong Nash equilibria, there must exist a maximal set $C$ of action
profiles such that for any $a \in C$ and $a' \in C$ there exists an improving path connecting $a$ and $a'$,
and no action profile in $C$ lies on an improving path leading to an action profile that is not in $C$. Such
a set $C$ is a closed cycle. □

An SSNE can be defined similarly. An action profile $a^*$ in Definition 2.1 is said to be an SSNE if
condition 1 is the same and condition 2 is $u_i(a) \ge u_i(a^*)$ for all $i \in S$ with at least one strict
inequality. That is, $a^*$ is an SSNE if no coalition $S \in \mathcal{S}$ can deviate from $a^*$ to some
$a \in A$ such that the actions of all players outside $S$ are the same in both $a$ and $a^*$, all players
in $S$ are at least as well off at $a$ as at $a^*$, and the payoff of at least one player in $S$ is better
at $a$ than at $a^*$. In this case, for a given action profile $a$ and coalition $S$, the set of improved
action profiles $\mathcal{I}_2(S, a)$ is defined as

\[ \mathcal{I}_2(S, a) = \{ a' \mid a'_i = a_i, \ \forall\, i \notin S, \ \text{and} \ u_i(a') \ge u_i(a), \ \forall\, i \in S, \ u_j(a') > u_j(a) \ \text{for some} \ j \in S \}, \tag{2} \]

and $\bar{\mathcal{I}}_2(S, a) = A(S, a) \setminus \mathcal{I}_2(S, a)$. Improving paths and cycles are
defined analogously to the previous case. A result similar to Theorem 2.4 holds, i.e., there always exists
an SSNE or a closed cycle of action profiles. An SSNE is always an SNE, i.e., the set of strict strong Nash
equilibria is a subset of the set of strong Nash equilibria. An SNE is weakly Pareto optimal and an SSNE is
Pareto optimal. Now, we give a few examples illustrating the presence of SNE, SSNE and closed cycles.

Example 2.5. Consider a two player game

\[
\begin{array}{c|cc}
 & b_1 & b_2 \\ \hline
a_1 & (-2,-2) & (-10,-1) \\
a_2 & (-1,-10) & (-5,-5)
\end{array}
\]

The above game is the well-known prisoner's dilemma. Here $(a_2, b_2)$ is the only Nash
equilibrium, and it is not an SNE because both players can jointly deviate to $(a_1, b_1)$, where both of
them are strictly better off. So, in this game there is no SNE and hence no SSNE. The closed cycle of
action profiles is given in Figure 1.

[Figure 1: closed cycle of action profiles in Example 2.5; the edge $(a_2, b_2) \to (a_1, b_1)$ carries the label $\{1, 2\}$.]

A directed edge $(a_2, b_2) \to (a_1, b_1)$ of Figure 1 represents a deviation by coalition $\{1, 2\}$. The
other directed edges of the closed cycle are similarly defined.
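
The sets $A(S, a)$ and $\mathcal{I}_1(S, a)$ and the SNE test of Definition 2.1 can be checked by direct enumeration. The following is a minimal sketch for the game of Example 2.5; the function names and the 0-based player indices are illustrative choices, not part of the report.

```python
from itertools import product

# Payoffs of Example 2.5 (prisoner's dilemma); profiles are (row, column).
U = {
    ("a1", "b1"): (-2, -2), ("a1", "b2"): (-10, -1),
    ("a2", "b1"): (-1, -10), ("a2", "b2"): (-5, -5),
}
ACTIONS = [("a1", "a2"), ("b1", "b2")]   # A_1, A_2
COALITIONS = [(0,), (1,), (0, 1)]         # players are indexed 0 and 1 here


def reachable(S, a):
    """A(S, a): profiles in which only members of S may change their action."""
    choices = [ACTIONS[i] if i in S else (a[i],) for i in range(len(a))]
    return set(product(*choices))


def improving(S, a):
    """I_1(S, a): reachable profiles that strictly benefit every member of S."""
    return {b for b in reachable(S, a) if all(U[b][i] > U[a][i] for i in S)}


def is_sne(a):
    """a is an SNE iff no coalition has an improving deviation from it."""
    return all(not improving(S, a) for S in COALITIONS)


print(improving((0, 1), ("a2", "b2")))   # joint deviation to (a1, b1)
print([a for a in U if is_sne(a)])        # no SNE in this game
```

Every profile admits some improving deviation here, which is consistent with all four profiles lying on the closed cycle.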
Example 2.6. Consider a two player game

\[
\begin{array}{c|ccc}
 & b_1 & b_2 & b_3 \\ \hline
a_1 & (4,4) & (0,0) & (0,0) \\
a_2 & (0,0) & (4,5) & (1,6) \\
a_3 & (0,0) & (2,5) & (6,1)
\end{array}
\]

This example has both an SNE and a closed cycle. The action profile $(a_1, b_1)$ is an SNE and the closed
cycle is shown in Figure 2.

[Figure 2: closed cycle of action profiles in Example 2.6.]

However, $(a_1, b_1)$ is not an SSNE because, according to the improved action profile set defined by (2),
both players can make a joint deviation from action profile $(a_1, b_1)$ to $(a_2, b_2)$. If we change the
payoff vector corresponding to $(a_2, b_2)$ from $(4, 5)$ to $(4 - \alpha, 5)$ for $\alpha > 0$, then
$(a_1, b_1)$ is also an SSNE.
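
The claims of Example 2.6 can be verified mechanically. The sketch below is one possible check, not the algorithm of the report: it builds the graph of improving deviations and returns the sets of profiles that improving paths can never leave (singletons are equilibria, larger sets are closed cycles). The flag `strict` is a hypothetical switch between $\mathcal{I}_1$ and $\mathcal{I}_2$.

```python
from itertools import product

# Payoffs of Example 2.6; a profile (j, k) stands for (a_j, b_k).
ROWS = [[(4, 4), (0, 0), (0, 0)], [(0, 0), (4, 5), (1, 6)], [(0, 0), (2, 5), (6, 1)]]
U = {(j + 1, k + 1): ROWS[j][k] for j in range(3) for k in range(3)}
COALITIONS = [(0,), (1,), (0, 1)]


def reachable(S, a):
    """A(S, a): only members of S may change their action."""
    return set(product(*[(1, 2, 3) if i in S else (a[i],) for i in range(2)]))


def improving(S, a, strict=True):
    """I_1(S, a) if strict, else I_2(S, a) (weak gains, at least one strict)."""
    out = set()
    for b in reachable(S, a):
        gains = [U[b][i] - U[a][i] for i in S]
        if strict and all(g > 0 for g in gains):
            out.add(b)
        if not strict and all(g >= 0 for g in gains) and any(g > 0 for g in gains):
            out.add(b)
    return out


def successors(a, strict=True):
    """All profiles reachable from a by one improving deviation of some coalition."""
    return set().union(*(improving(S, a, strict) for S in COALITIONS))


def recurrent_classes(strict=True):
    """Sets of profiles that improving paths never leave: equilibria and closed cycles."""
    reach = {}
    for a in U:
        seen, stack = {a}, [a]
        while stack:
            for b in successors(stack.pop(), strict) - seen:
                seen.add(b)
                stack.append(b)
        reach[a] = seen
    return {frozenset(reach[a]) for a in U if all(a in reach[b] for b in reach[a])}


for cls in recurrent_classes(strict=True):
    print(sorted(cls))
```

With `strict=True` the classes are the SNE $(a_1, b_1)$ and the four-profile closed cycle; with `strict=False` the profile $(a_1, b_1)$ drops out, in line with the remark that it is not an SSNE.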

3 Dynamic play 

We consider the situation where n players play the strategic game defined in Section 2. We assume that
the players can communicate with each other a priori and hence can form a coalition and jointly
deviate from the current action profile to a new action profile if the new action profile is strictly
beneficial for all members of the coalition. We consider coalition formation over an infinite horizon. That
is, at each time a coalition is randomly formed and it deviates from the current action profile to a new
action profile such that the actions of the players outside the coalition remain the same as before and
each player of the coalition is strictly better off. If there is no such improved action profile for a
coalition, then it does not deviate. The same procedure repeats at the next stage and continues over the
infinite horizon. Such deviations define a coalitional better-response (CBR) dynamics. We assume that the
coalition formation is random and that at each time only one coalition can be formed. If there is more than
one improved action profile for a coalition, then each improved action profile can be chosen with positive
probability. That is, the CBR dynamics is stochastic. The CBR dynamics defines a Markov chain over the
finite set of action profiles $A$. We also assume that at each time the selected coalition may make a
mistake and jointly deviate to an action profile where not all members of the coalition are strictly better
off. This happens with very small probability. Such mistakes add mutations to the CBR dynamics. The
mutations add another level of stochasticity to the CBR dynamics and as a result we obtain a perturbed
Markov chain. We are interested in the action profiles that are selected by the CBR dynamics as mutations
vanish. We next describe the stochastic CBR dynamics discussed above.

3.1 A stochastic CBR dynamics without mistakes 

At each time $t = 0, 1, 2, \dots$ a coalition $S_t$ is selected randomly with probability $p_{S_t} > 0$. We
assume that at each time the selected coalition makes an improving deviation from the current action
profile $a^t$, i.e., at time $t+1$ the new action profile is $a^{t+1} \in \mathcal{I}_1(S_t, a^t)$ with
probability $p_{\mathcal{I}_1}(a^{t+1} \mid S_t, a^t)$, where $p_{\mathcal{I}_1}(\cdot \mid S_t, a^t)$ is a
probability measure over the finite set $\mathcal{I}_1(S_t, a^t)$. When there are no improving deviations
for coalition $S_t$, then $a^{t+1} = a^t$. Let $X_t^0$ denote the action profile at time $t$; then
$\{X_t^0\}_{t=0}^{\infty}$ is a finite Markov chain on the set $A$. The transition law $P^0$ of the Markov
chain is defined as follows:

\[
P^0(X_{t+1}^0 = a' \mid X_t^0 = a) = \sum_{S \in \mathcal{S};\, \mathcal{I}_1(S,a) \ne \emptyset} p_S\, p_{\mathcal{I}_1}(a' \mid S, a)\, 1_{\mathcal{I}_1(S,a)}(a') + \sum_{S \in \mathcal{S};\, \mathcal{I}_1(S,a) = \emptyset} p_S\, 1_{\{a' = a\}}(a'), \tag{3}
\]

where $1_B$ is the indicator function of a given set $B$. It is clear that the strong Nash equilibria and
closed cycles are the recurrent classes of $P^0$. An SNE corresponds to an absorbing state of $P^0$ and a
closed cycle corresponds to a recurrent class of $P^0$ having more than one action profile.
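
To make the transition law (3) concrete, the sketch below builds $P^0$ for the game of Example 2.6. The uniform coalition probabilities $p_S = 1/3$ and the uniform measure $p_{\mathcal{I}_1}$ are assumptions made for illustration only; the model allows any positive choice.

```python
from fractions import Fraction
from itertools import product

# Payoffs of Example 2.6; a profile (j, k) stands for (a_j, b_k).
ROWS = [[(4, 4), (0, 0), (0, 0)], [(0, 0), (4, 5), (1, 6)], [(0, 0), (2, 5), (6, 1)]]
U = {(j + 1, k + 1): ROWS[j][k] for j in range(3) for k in range(3)}
COALITIONS = [(0,), (1,), (0, 1)]
P_S = Fraction(1, len(COALITIONS))   # assumed: coalitions drawn uniformly


def improving(S, a):
    """I_1(S, a): profiles reachable by S that strictly benefit all its members."""
    cand = product(*[(1, 2, 3) if i in S else (a[i],) for i in range(2)])
    return [b for b in cand if all(U[b][i] > U[a][i] for i in S)]


def transition_row(a):
    """Row P^0(. | a) of the transition law (3)."""
    row = {b: Fraction(0) for b in U}
    for S in COALITIONS:
        better = improving(S, a)
        if better:
            for b in better:
                row[b] += P_S / len(better)   # assumed: uniform p_{I_1}
        else:
            row[a] += P_S                      # no improvement: stay put
    return row


P0 = {a: transition_row(a) for a in U}
print(P0[(1, 1)][(1, 1)])   # the SNE (a_1, b_1) is absorbing
print({b: p for b, p in P0[(2, 2)].items() if p > 0})
```

Exact fractions are used so that the row sums and the absorbing property can be checked without rounding.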

From Example 2.6 it is clear that in general closed cycles can be present in a game together with strong
Nash equilibria. In that case the CBR dynamics need not converge. In Example 2.6 the CBR
dynamics need not converge to the SNE $(a_1, b_1)$ because once the CBR dynamics enters the closed cycle
given in Figure 2, it never leaves it. The closed cycle
$C = \{(a_2, b_2), (a_2, b_3), (a_3, b_3), (a_3, b_2)\}$ is a recurrent class and $(a_1, b_1)$ is an
absorbing state of the Markov chain $P^0$ corresponding to the game given in Example 2.6.

We call a game acyclic if it has no closed cycles. The acyclic games include coordination games. By
Theorem 2.4, there exists at least one SNE in an acyclic game. For acyclic games the Markov chain defined
by (3) is absorbing. Hence, from the theory of Markov chains, the CBR dynamics given in Section 3.1 will
be at an SNE in the long run no matter where it starts.

3.2 A stochastic CBR dynamics with mistakes 

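Now, we assume that at each time $t$ the selected coalition $S_t$ may make an error in deviating from
$a^t$ and as a result move to an action profile where some player(s) in the coalition $S_t$ are not strictly
better off. We assume that at action profile $a^t$, coalition $S_t$ makes an error with probability
$f(S_t, a^t)\varepsilon$, where $f : \mathcal{S} \times A \to (0, \infty)$ and $0 < \varepsilon < \frac{1}{M}$
with $M = \max_{S \in \mathcal{S}, a \in A} f(S, a)$. The factor $f(S_t, a^t)$ captures the dependence of the
error probability on the coalition $S_t$ and the current action profile $a^t$. The factor $\varepsilon$
determines the probability with which players in general make mistakes. These mistakes add mutations to
the CBR dynamics and as a result we have a perturbed Markov chain $\{X_t^\varepsilon\}_{t=0}^{\infty}$. So,
at time $t+1$, with probability $(1 - f(S_t, a^t)\varepsilon)\, p_{\mathcal{I}_1}(a^{t+1} \mid S_t, a^t)$
the perturbed Markov chain switches to $a^{t+1} \in \mathcal{I}_1(S_t, a^t)$ and with probability
$f(S_t, a^t)\varepsilon\, p_{\bar{\mathcal{I}}_1}(a^{t+1} \mid S_t, a^t)$ it switches to
$a^{t+1} \in \bar{\mathcal{I}}_1(S_t, a^t)$; here $p_{\bar{\mathcal{I}}_1}(\cdot \mid S_t, a^t)$ is a
probability measure over the finite set $\bar{\mathcal{I}}_1(S_t, a^t)$. In the situation where there are no
improved action profiles for coalition $S_t$, $a^{t+1} = a^t$ with probability $1 - f(S_t, a^t)\varepsilon$
and $a^{t+1} \in \bar{\mathcal{I}}_1(S_t, a^t) \setminus \{a^t\}$ with probability
$f(S_t, a^t)\varepsilon\, p_{\bar{\mathcal{I}}_1 \setminus \{a^t\}}(a^{t+1} \mid S_t, a^t)$; here
$p_{\bar{\mathcal{I}}_1 \setminus \{a^t\}}(\cdot \mid S_t, a^t)$ is a probability measure over the finite set
$\bar{\mathcal{I}}_1(S_t, a^t) \setminus \{a^t\}$. The transition law of the perturbed Markov chain
is defined as below:

\[
\begin{aligned}
P^\varepsilon(X_{t+1}^\varepsilon = a' \mid X_t^\varepsilon = a) = {} & \sum_{S \in \mathcal{S};\, \mathcal{I}_1(S,a) \ne \emptyset} p_S \Big( (1 - f(S,a)\varepsilon)\, p_{\mathcal{I}_1}(a' \mid S, a)\, 1_{\mathcal{I}_1(S,a)}(a') \\
& \qquad\qquad + f(S,a)\varepsilon\, p_{\bar{\mathcal{I}}_1}(a' \mid S, a)\, 1_{\bar{\mathcal{I}}_1(S,a)}(a') \Big) \\
& + \sum_{S \in \mathcal{S};\, \mathcal{I}_1(S,a) = \emptyset} p_S \Big( (1 - f(S,a)\varepsilon)\, 1_{\{a' = a\}}(a') \\
& \qquad\qquad + f(S,a)\varepsilon\, p_{\bar{\mathcal{I}}_1 \setminus \{a\}}(a' \mid S, a)\, 1_{\bar{\mathcal{I}}_1(S,a) \setminus \{a\}}(a') \Big),
\end{aligned} \tag{4}
\]

for all $a, a' \in A$.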
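
As a worked instance of (4), take the prisoner's dilemma of Example 2.5 at $a = (a_2, b_2)$. The specific choices here are illustrative assumptions, not part of the model: the three coalitions $\{1\}$, $\{2\}$, $\{1,2\}$ are drawn with $p_S = 1/3$, $f \equiv 1$, and all the measures are uniform. Then $\mathcal{I}_1(\{1\}, a) = \mathcal{I}_1(\{2\}, a) = \emptyset$, $\mathcal{I}_1(\{1,2\}, a) = \{(a_1, b_1)\}$ and $\bar{\mathcal{I}}_1(\{1,2\}, a) = \{(a_1, b_2), (a_2, b_1), (a_2, b_2)\}$, so that

\[
\begin{aligned}
P^\varepsilon((a_1, b_1) \mid a) &= \tfrac{1}{3}(1 - \varepsilon), \\
P^\varepsilon((a_1, b_2) \mid a) &= \tfrac{1}{3}\varepsilon + \tfrac{1}{3} \cdot \tfrac{\varepsilon}{3} = \tfrac{4\varepsilon}{9}, \\
P^\varepsilon((a_2, b_1) \mid a) &= \tfrac{1}{3}\varepsilon + \tfrac{1}{3} \cdot \tfrac{\varepsilon}{3} = \tfrac{4\varepsilon}{9}, \\
P^\varepsilon((a_2, b_2) \mid a) &= \tfrac{2}{3}(1 - \varepsilon) + \tfrac{\varepsilon}{9}.
\end{aligned}
\]

The four probabilities sum to 1 and are all positive for $0 < \varepsilon < 1$. Setting $\varepsilon = 0$ leaves only $P^0((a_1, b_1) \mid a) = 1/3$ and $P^0(a \mid a) = 2/3$, which is what (3) gives under the same assumptions.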

Given all possible coalitional moves and nonzero mutations, it is possible to reach any action profile
from any other with positive probability in one step. This implies that the perturbed Markov chain
$\{X_t^\varepsilon\}_{t=0}^{\infty}$ is aperiodic and irreducible. Hence, there exists a unique stationary
distribution $\mu^\varepsilon$ of the perturbed Markov chain. However, when $\varepsilon = 0$, there can be
several stationary distributions corresponding to different SNEs or closed cycles. Such Markov chains are
called singularly perturbed Markov chains. We are interested in the action profiles to which the stationary
distribution $\mu^\varepsilon$ assigns positive probability as $\varepsilon \to 0$.
This leads to the definition of a stochastically stable action profile.

Definition 3.1. An action profile $a$ is stochastically stable relative to the process $P^\varepsilon$ if
$\lim_{\varepsilon \to 0} \mu^\varepsilon(a) > 0$.

We recall a few definitions from [15]. From (4) we have $P^\varepsilon(a' \mid a) > 0$ for all
$a, a' \in A$. The one-step resistance from an action profile $a$ to an action profile $a' \ne a$ is defined
as the minimum number of mistakes (mutations) that are required for the transition from $a$ to $a'$, and
it is denoted by $r(a, a')$. From (4) it is clear that the transition from $a$ to $a'$ has probability of
order $\varepsilon$ if $a' \notin \mathcal{I}_1(S, a)$ for all $S$, and thus has resistance 1; otherwise it
has probability of order 1, and so has resistance 0. So, in our setting $r(a, a') \in \{0, 1\}$
for all $a, a' \in A$. A zero resistance between two action profiles corresponds to a transition with
positive probability under $P^0$. One can view the action profiles as the nodes of a directed graph that
has no self loops, in which the weight of a directed edge between two different nodes is the one-step
resistance between them. Since $P^\varepsilon$ is an irreducible Markov chain, there must exist at least one
directed path between any two recurrent classes $H_i$ and $H_j$ of $P^0$ which starts from $H_i$ and ends
at $H_j$. The resistance of a path is defined as the sum of the weights of its edges. The minimum
resistance among all paths from $H_i$ to $H_j$ is called the resistance from $H_i$ to $H_j$ and is denoted
by $r_{ij}$. The resistance from any action profile $a^i \in H_i$ to any action profile $a^j \in H_j$ is
$r_{ij}$, because inside $H_i$ and $H_j$ action profiles are connected by paths of zero resistance. Here
$r_{ij} = 1$, because given all possible coalitional deviations it is always possible to move from an action
profile in $H_i$ to an action profile in $H_j$ with exactly 1 mutation.

Now we recall the definition of the stochastic potential of a recurrent class $H_i$ of $P^0$ from [13]. It
can be computed by restricting to a reduced graph. Construct a graph $\mathcal{G}$ whose nodes correspond
to the recurrent classes of $P^0$ (one action profile from each recurrent class) and in which a directed
edge from $a^i$ to $a^j$ is weighted by $r_{ij}$. That is, the resistance of a directed edge from $a^i$ to
$a^j$ is 1. Take a node $a^i \in \mathcal{G}$ and consider all the spanning trees such that from every node
$a^j \in \mathcal{G}$, $a^j \ne a^i$, there is a unique path directed from $a^j$ to $a^i$. Such spanning
trees are called $a^i$-trees. The resistance of an $a^i$-tree is the sum of the resistances of its edges.
The stochastic potential of $a^i$ is the resistance of an $a^i$-tree having minimum resistance among all
$a^i$-trees. The stochastic potential of each node in $H_i$ is the same, and it is the stochastic potential
of $H_i$. Suppose there are $J$ recurrent classes of $P^0$. Then an $a^i$-tree has $J - 1$ edges and the
resistance of each edge is 1. So, the resistance of each $a^i$-tree is $J - 1$. This implies that the
stochastic potential of the recurrent class $H_i$ is $J - 1$, and this is true for all the recurrent
classes. So, in our case the stochastic potential of all the recurrent classes of $P^0$ is the same.
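
The counting argument above can be checked by brute force on a small reduced graph. The sketch below enumerates every rooted tree on $J = 4$ nodes with unit edge resistances and takes the minimum total resistance; the helper names and the choice $J = 4$ are illustrative assumptions.

```python
from itertools import product


def reaches_root(v, tree, root):
    """Follow parent pointers from v; fail if a cycle is hit before the root."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = tree[v]
    return True


def rooted_trees(nodes, root):
    """Yield every root-tree as a dict child -> parent (edges point toward root)."""
    others = [v for v in nodes if v != root]
    for parents in product(nodes, repeat=len(others)):
        tree = dict(zip(others, parents))
        if any(child == parent for child, parent in tree.items()):
            continue                      # no self loops
        if all(reaches_root(v, tree, root) for v in others):
            yield tree


def stochastic_potential(nodes, root, r):
    """Minimum total resistance over all root-trees, with edge weights r[u][v]."""
    return min(sum(r[u][v] for u, v in t.items()) for t in rooted_trees(nodes, root))


J = 4
nodes = list(range(J))
r = {u: {v: 1 for v in nodes if v != u} for u in nodes}   # r_ij = 1 for i != j
print([stochastic_potential(nodes, i, r) for i in nodes])
```

Since every rooted tree has exactly $J - 1$ edges of resistance 1, the minimum coincides with the resistance of any single tree, and it is the same for every root.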

Theorem 3.2. All strong Nash equilibria and closed cycles of an n-player finite strategic game are 
stochastically stable. 
Proof. We know that the Markov chain $P^\varepsilon$ is aperiodic and irreducible. From (3) and (4) it is
easy to see that

\[ \lim_{\varepsilon \to 0} P^\varepsilon(a' \mid a) = P^0(a' \mid a), \quad \forall\, a, a' \in A. \]

From (4) it is clear that, if $P^\varepsilon(a' \mid a) > 0$ for some $\varepsilon \in (0, \varepsilon_0]$,
then we have

\[ 0 < \varepsilon^{-r(a, a')} P^\varepsilon(a' \mid a) < \infty. \]

The Markov chain $P^\varepsilon$ satisfies all three required conditions of Theorem 4 in [15], from which it
follows that as $\varepsilon \to 0$, $\mu^\varepsilon$ converges to a stationary distribution $\mu^0$ of
$P^0$, and an action profile $a$ is stochastically stable, i.e., $\mu^0(a) > 0$, if and only if $a$ is
contained in a recurrent class of $P^0$ having minimum stochastic potential. We know that the recurrent
classes of the Markov chain $P^0$ are strong Nash equilibria or closed cycles and that the stochastic
potential of all the recurrent classes is the same. Thus, all the strong Nash
equilibria and closed cycles are stochastically stable. □
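
Theorem 3.2 can be illustrated numerically on Example 2.6. The sketch below builds $P^\varepsilon$ from (4) and approximates its stationary distribution by power iteration. The uniform $p_S$, the uniform measures, $f \equiv 1$ and the number of iterations are all assumptions chosen for illustration; this is a numerical check, not a proof.

```python
from itertools import product

# Payoffs of Example 2.6; a profile (j, k) stands for (a_j, b_k).
ROWS = [[(4, 4), (0, 0), (0, 0)], [(0, 0), (4, 5), (1, 6)], [(0, 0), (2, 5), (6, 1)]]
U = {(j + 1, k + 1): ROWS[j][k] for j in range(3) for k in range(3)}
STATES = sorted(U)
COALITIONS = [(0,), (1,), (0, 1)]
CYCLE = [(2, 2), (2, 3), (3, 3), (3, 2)]


def split(S, a):
    """Return I_1(S, a) and its complement in A(S, a)."""
    cand = list(product(*[(1, 2, 3) if i in S else (a[i],) for i in range(2)]))
    good = [b for b in cand if all(U[b][i] > U[a][i] for i in S)]
    return good, [b for b in cand if b not in good]


def perturbed_row(a, eps):
    """Row P^eps(. | a) of (4); assumes uniform p_S, uniform measures, f = 1."""
    row = dict.fromkeys(STATES, 0.0)
    p_s = 1.0 / len(COALITIONS)
    for S in COALITIONS:
        good, bad = split(S, a)
        if good:
            for b in good:
                row[b] += p_s * (1 - eps) / len(good)
            for b in bad:
                row[b] += p_s * eps / len(bad)
        else:
            mistakes = [b for b in bad if b != a]
            row[a] += p_s * (1 - eps)
            for b in mistakes:
                row[b] += p_s * eps / len(mistakes)
    return [row[b] for b in STATES]


def stationary(P, steps=10000):
    """Power iteration mu <- mu P, started from the uniform distribution."""
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(steps):
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return mu


def masses(eps):
    """Stationary mass on the SNE (a_1, b_1) and on the closed cycle."""
    mu = stationary([perturbed_row(a, eps) for a in STATES])
    sne = mu[STATES.index((1, 1))]
    cycle = sum(mu[STATES.index(c)] for c in CYCLE)
    return sne, cycle


for eps in (0.1, 0.01):
    print(eps, masses(eps))
```

For small $\varepsilon$ both the SNE and the closed cycle keep a non-negligible share of the stationary mass, while the remaining profiles carry mass of order $\varepsilon$.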

Remark 3.3. Since the perturbed process $P^\varepsilon$ satisfies the conditions of Theorem 4 in [15] for all
functions $f(\cdot)$, the stochastic stability of strong Nash equilibria and closed cycles is independent
of $f(\cdot)$.

We can have a similar CBR dynamics without mistakes and with mistakes, as given in Sections 3.1
and 3.2 respectively, if for all $S \in \mathcal{S}$ and $a \in A$ the set of improved action profiles is
$\mathcal{I}_2(S, a)$ as defined by (2). We have the following result.

Theorem 3.4. All strict strong Nash equilibria and closed cycles of an n-player finite strategic game are
stochastically stable under the corresponding CBR dynamics.

Proof. The proof follows from arguments similar to those given for Theorem 3.2. □

3.2.1 Equilibrium selection in coordination games 

First we consider a 2 x 2 coordination game and discuss which Nash equilibrium is selected by the CBR
dynamics in the long run when the probability of making mistakes vanishes. We compare our equilibrium
selection results in 2 x 2 coordination games with the existing results of Young [15] and Kandori et al.
Later we discuss equilibrium selection results in general m x m symmetric coordination games.

Consider a 2 x 2 coordination game

\[
\begin{array}{c|cc}
 & s_1 & s_2 \\ \hline
s_1 & (a_{11}, b_{11}) & (a_{12}, b_{12}) \\
s_2 & (a_{21}, b_{21}) & (a_{22}, b_{22})
\end{array}
\]

where $a_{jk}, b_{jk} \in \mathbb{R}$, $j, k \in \{1, 2\}$, and $a_{11} > a_{21}$, $b_{11} > b_{12}$,
$a_{22} > a_{12}$, $b_{22} > b_{21}$; $A_i = \{s_1, s_2\}$, $i = 1, 2$.
Here $(s_1, s_1)$ and $(s_2, s_2)$ are two strict Nash equilibria. In this game there are two types of Nash
equilibria: payoff dominant and risk dominant. If $a_{11} > a_{22}$, $b_{11} > b_{22}$, then $(s_1, s_1)$ is
payoff dominant, and if $a_{11} < a_{22}$, $b_{11} < b_{22}$, then $(s_2, s_2)$ is payoff dominant. In other
cases a payoff dominant Nash equilibrium does not exist. Define

\[
P_1 = \min\left\{ \frac{a_{11} - a_{21}}{a_{11} - a_{12} - a_{21} + a_{22}},\ \frac{b_{11} - b_{12}}{b_{11} - b_{12} - b_{21} + b_{22}} \right\},
\]

\[
P_2 = \min\left\{ \frac{a_{22} - a_{12}}{a_{11} - a_{12} - a_{21} + a_{22}},\ \frac{b_{22} - b_{21}}{b_{11} - b_{12} - b_{21} + b_{22}} \right\}.
\]

If $P_1 > P_2$, then $(s_1, s_1)$ is the risk dominant Nash equilibrium and if $P_2 > P_1$, then
$(s_2, s_2)$ is the risk dominant Nash equilibrium. A payoff dominant Nash equilibrium is always an SNE.
Hence, the CBR dynamics always selects the payoff dominant Nash equilibrium whenever it exists. When a
payoff dominant Nash equilibrium does not exist, both Nash equilibria are strong Nash equilibria and in
that case the CBR dynamics selects both of them. In contrast, the stochastic dynamics of Young [15] always
selects a risk dominant Nash equilibrium.
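
The contrast between the two selection criteria can be seen on a small numerical instance. The sketch below computes $P_1$ and $P_2$ as defined above and tests payoff dominance; the stag-hunt style payoffs are hypothetical and were chosen only because they satisfy the coordination inequalities.

```python
from fractions import Fraction


def risk_factors(a, b):
    """P_1 and P_2 for a 2 x 2 coordination game; a[j][k], b[j][k] with j, k in {0, 1}."""
    da = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    db = b[0][0] - b[0][1] - b[1][0] + b[1][1]
    p1 = min(Fraction(a[0][0] - a[1][0], da), Fraction(b[0][0] - b[0][1], db))
    p2 = min(Fraction(a[1][1] - a[0][1], da), Fraction(b[1][1] - b[1][0], db))
    return p1, p2


def payoff_dominant(a, b):
    """Return 0 or 1 for the payoff dominant equilibrium (s1,s1) or (s2,s2), else None."""
    if a[0][0] > a[1][1] and b[0][0] > b[1][1]:
        return 0
    if a[0][0] < a[1][1] and b[0][0] < b[1][1]:
        return 1
    return None


# Hypothetical stag-hunt payoffs satisfying the coordination inequalities.
a = [[4, 0], [3, 3]]   # row player
b = [[4, 3], [0, 3]]   # column player
p1, p2 = risk_factors(a, b)
print(p1, p2)                  # risk factors of (s1, s1) and (s2, s2)
print(payoff_dominant(a, b))   # index of the payoff dominant equilibrium
```

In this instance $P_2 > P_1$, so $(s_2, s_2)$ is risk dominant, while $(s_1, s_1)$ is payoff dominant and is therefore the equilibrium selected by the CBR dynamics.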
A 2 x 2 symmetric coordination game is considered by Kandori et al. For this game $a_{jk} = b_{kj}$,
$j, k \in \{1, 2\}$. In this case there always exists a payoff dominant Nash equilibrium. Hence the CBR
dynamics always selects the payoff dominant Nash equilibrium, which is the only SNE. In contrast, the
stochastic dynamics of Kandori et al. always selects a risk dominant Nash equilibrium. For symmetric
coordination games beyond 2 x 2 matrix games, the result of Young [15] cannot be generalized, i.e., his
dynamics need not select a risk dominant Nash equilibrium. Consider an example of a 3 x 3 matrix game from
the literature,

\[
\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
s_1 & (6,6) & (0,5) & (0,0) \\
s_2 & (5,0) & (7,7) & (5,5) \\
s_3 & (0,0) & (5,5) & (8,8)
\end{array}
\]

Here $(s_1, s_1)$, $(s_2, s_2)$ and $(s_3, s_3)$ are three Nash equilibria. The stochastic dynamics of
Young [15] selects $(s_2, s_2)$, which is not a risk dominant Nash equilibrium. A Nash equilibrium of an
m x m symmetric coordination game is risk dominant if it is risk dominant in all pairwise contests. For the
above 3 x 3 game, the Nash equilibrium $(s_3, s_3)$ is risk dominant as well as payoff dominant, and it is
also an SNE. Hence, the CBR dynamics selects $(s_3, s_3)$. In fact, for all m x m symmetric coordination
games, the CBR dynamics always selects a payoff dominant Nash equilibrium because it is an SNE.


4 Application to network formation games 

In this section we consider network formation games, which are treated in several recent books. In
general, networks which are stable against the deviations of all coalitions are called strongly
stable networks. In the literature, there are two definitions of strongly stable networks. The first
definition is due to Dutta and Mutuswami [3] and corresponds to SNE. The second definition is due to
Jackson and van den Nouweland and corresponds to SSNE. A strongly stable network according to the
definition of Jackson and van den Nouweland is also a strongly stable network according to the definition
of Dutta and Mutuswami [3]. The definition of a strongly stable network according to Jackson and van den
Nouweland is more often considered in the literature. We also consider the strong stability of networks
according to Jackson and van den Nouweland. We discuss the dynamic formation of networks over an infinite
horizon. We apply the CBR dynamics corresponding to SSNE to network formation games to discuss the
stochastic stability of networks.

4.1 The model 

Let $N = \{1, 2, \dots, n\}$ be a finite set of players, also called nodes. The players are connected through
undirected edges. An edge can be defined as a subset of $N$ of size 2, e.g., $\{i, j\} \subset N$ defines an edge
between player $i$ and player $j$, written $ij$ for short. The collection of edges defines a network. Let $G$
denote the set of all networks on $N$. For each $i \in N$, let $u_i : G \to \mathbb{R}$ be a payoff function of
player $i$, where $u_i(g)$ is the payoff of player $i$ at network $g$.

Moving from one network to another requires the addition of new links or the destruction of existing
links. It is commonly assumed in the literature that forming a new link requires the consent of both
players, while a player can delete a link unilaterally. Coalition formation in network formation games
has also been considered in the literature. Some players in a network can form a coalition and make a
joint move to another network by adding or severing some links, if the new network is at least as beneficial
as the previous network for all the players of the coalition and at least one player is strictly benefited.
We recall a few definitions from the literature describing the coalitional moves in network formation games
and the stability of networks against all possible coalitional deviations.

Definition 4.1. A network $g'$ is obtainable from $g$ via deviation by a coalition $S \in \mathcal{S}$, denoted by
$g \to_S g'$, if

1. $ij \in g'$ and $ij \notin g$ imply $\{i, j\} \subset S$.

2. $ij \in g$ and $ij \notin g'$ imply $\{i, j\} \cap S \neq \emptyset$.

The first condition of the above definition requires that a new link can be added only between nodes
which are part of the coalition $S$, and the second condition requires that at least one node of any deleted
link is part of the coalition $S$. We denote by $G(S, g)$ the set of all networks which are obtainable
from $g$ via deviation by $S$, i.e., $G(S, g) = \{g' \mid g \to_S g'\}$.
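
The two conditions of Definition 4.1 translate directly into set operations. The sketch below is illustrative only: it assumes a network is stored as a `frozenset` of links and a link as a `frozenset` of its two end nodes, and the names `edge`, `is_obtainable` and `obtainable_set` are hypothetical.

```python
from itertools import combinations


def edge(i, j):
    """An undirected link ij, stored as a frozenset of its two end nodes."""
    return frozenset((i, j))


def is_obtainable(g, g_new, S):
    """True if g_new is obtainable from g via deviation by coalition S."""
    S = set(S)
    added = g_new - g      # links present only in the new network
    deleted = g - g_new    # links present only in the old network
    return (all(e <= S for e in added)        # condition 1: both ends in S
            and all(e & S for e in deleted))  # condition 2: some end in S


def obtainable_set(g, S, nodes):
    """Enumerate G(S, g) by brute force over all networks on `nodes`."""
    links = [edge(i, j) for i, j in combinations(sorted(nodes), 2)]
    result = []
    for r in range(len(links) + 1):
        for subset in combinations(links, r):
            g_new = frozenset(subset)
            if is_obtainable(g, g_new, S):
                result.append(g_new)
    return result


# Example: the line network 1 - 2 - 3 and the coalition {1, 2}.
g = frozenset({edge(1, 2), edge(2, 3)})
print(len(obtainable_set(g, {1, 2}, {1, 2, 3})))
```

In this example the coalition {1, 2} cannot add the link 13, since node 3 is outside the coalition, but it can delete either existing link because node 2 belongs to both. The brute-force enumeration is exponential in the number of possible links and is meant for small examples only.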

Definition 4.2. A deviation by a coalition $S$ from a network $g$ to a network $g'$ is said to be improving if

1. $g \to_S g'$,

2. $u_i(g') \ge u_i(g)$, $\forall\, i \in S$ (with at least one strict inequality).

We denote by $\mathcal{I}_2(S, g)$ the set of all networks $g'$ which are obtainable from $g$ by an improving
deviation of $S$, i.e.,

\[
\mathcal{I}_2(S, g) = \{ g' \mid g \to_S g',\ u_i(g') \ge u_i(g)\ \forall\, i \in S,\ u_j(g') > u_j(g) \text{ for some } j \in S \}.
\]

It is clear that $g \notin \mathcal{I}_2(S, g)$ for all $S$. We denote by $\bar{\mathcal{I}}_2(S, g) = G(S, g) \setminus \mathcal{I}_2(S, g)$
the set of all networks which are obtainable from $g$ due to erroneous decisions of $S$. This set is always
nonempty as $g \in \bar{\mathcal{I}}_2(S, g)$ for all $S$.

Definition 4.3. A network g is said to be strongly stable if it is not possible for any coalition S to make 
an improving deviation from network g to some other network g'. 
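
For a small number of players, Definitions 4.1 to 4.3 can be checked by exhaustive enumeration. The following Python sketch is an illustration under stated assumptions, not the authors' procedure: networks are `frozenset`s of two-node `frozenset`s, and the two payoff functions `degree_payoff` and `one_link_payoff` are hypothetical examples chosen so that the first game has a strongly stable network while the second has none.

```python
from itertools import combinations


def all_networks(nodes):
    """All networks on `nodes`, each a frozenset of two-node frozensets."""
    links = [frozenset(p) for p in combinations(sorted(nodes), 2)]
    return [frozenset(c) for r in range(len(links) + 1)
            for c in combinations(links, r)]


def coalitions(nodes):
    """All nonempty subsets of `nodes`."""
    nodes = sorted(nodes)
    return [frozenset(c) for r in range(1, len(nodes) + 1)
            for c in combinations(nodes, r)]


def improving(g, g_new, S, payoff):
    """Definition 4.2: g_new is obtainable by S, weakly better for every
    member of S, and strictly better for at least one member."""
    obtainable = (all(e <= S for e in g_new - g)
                  and all(e & S for e in g - g_new))
    if not obtainable:
        return False
    gains = [payoff(i, g_new) - payoff(i, g) for i in S]
    return min(gains) >= 0 and max(gains) > 0


def strongly_stable_networks(nodes, payoff):
    """Definition 4.3: networks admitting no improving deviation."""
    nets, coals = all_networks(nodes), coalitions(nodes)
    return [g for g in nets
            if not any(improving(g, h, S, payoff)
                       for S in coals for h in nets)]


def degree(i, g):
    return sum(1 for e in g if i in e)


def degree_payoff(i, g):
    """Hypothetical payoff: every link is beneficial."""
    return degree(i, g)


def one_link_payoff(i, g):
    """Hypothetical payoff: a player wants exactly one link."""
    return 1 if degree(i, g) == 1 else 0


nodes = {1, 2, 3}
print(strongly_stable_networks(nodes, degree_payoff))
print(strongly_stable_networks(nodes, one_link_payoff))
```

With `degree_payoff` only the complete network survives. With `one_link_payoff` on three nodes every network admits an improving deviation, so the returned list is empty and the improving paths must end in a closed cycle.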

A strongly stable network need not always exist. In that case there exists some set of networks lying
on a closed cycle, and all the networks in a closed cycle can be reached from each other via an improving
path. An improving path and a closed cycle in network formation games can be defined similarly to
Definitions 2.2 and 2.3, respectively.

Theorem 4.4. There exists at least one strongly stable network or one closed cycle of networks.

Proof. The proof follows from arguments similar to those used in Theorem 2.4. □

4.2 Dynamic network formation 

The paper by Jackson and Watts is the first one to consider the dynamic formation of networks. They
considered the case where at each time only a pair of players forms a coalition and only the link between
them can be altered. We consider the situation where at each time a subset of players forms a coalition
and deviates from the current network to a new network if at the new network the payoff of each player of
the coalition is at least as much as at the current network and at least one player has a strictly better
payoff. This process continues over infinite horizon. A coalition can make all possible changes in the
network and, as a result, more than one link can be created or severed at each time. So, we consider the
following network formation rules by Jackson and van den Nouweland:

• Link addition is bilateral, i.e., forming a link between player i and player j requires the consent of 
both players. 

• Link destruction is unilateral, i.e., severing a link between player i and player j requires that player 
i or player j or both agree to sever the link. 

• More than one link can be created or severed by the players at a time.

The CBR dynamics corresponding to SSNE can be applied to dynamic network formation. That is, at
time $t$ the network is $g_t$ and a coalition $S_t$ is selected with probability $p_{S_t} > 0$, and it makes an
improving deviation to a new network that is at least as beneficial as $g_t$ for all players of coalition $S_t$
and strictly beneficial for at least one player of $S_t$. So, at time $t + 1$ the network is
$g_{t+1} \in \mathcal{I}_2(S_t, g_t)$ with probability $p_{\mathcal{I}_2}(g_{t+1} \mid S_t, g_t)$. If an improving deviation is not
possible for the selected coalition $S_t$, then $g_{t+1} = g_t$. The above process defines a Markov chain over
the state space $G$ and its transition probabilities can be defined similarly to (3). In general this Markov
chain is multichain: an absorbing state is a strongly stable network, and a recurrent class having more
than one network is a closed cycle of networks. We can also assume that at each time the selected
coalition $S_t$ makes an error with small probability $f(S_t, g_t)\varepsilon$. That is,
$g_{t+1} \in \mathcal{I}_2(S_t, g_t)$ with probability $(1 - f(S_t, g_t)\varepsilon)\, p_{\mathcal{I}_2}(g_{t+1} \mid S_t, g_t)$ and
$g_{t+1} \in \bar{\mathcal{I}}_2(S_t, g_t)$ with probability $f(S_t, g_t)\varepsilon\, p_{\bar{\mathcal{I}}_2}(g_{t+1} \mid S_t, g_t)$.
The transition probabilities of the perturbed Markov chain can be defined similarly to those of the
perturbed CBR dynamics for strategic games. The presence of mutations makes the Markov chain ergodic,
so there exists a unique stationary distribution. We are interested in the stochastically stable networks,
i.e., the networks to which positive probabilities are assigned by the stationary distribution as
$\varepsilon \to 0$. A stochastic stability analysis similar to the one given in Section 3 holds here. Thus, we
have the following result.

Theorem 4.5. All the strongly stable networks and closed cycles of a network formation game with the 
corresponding CBR dynamics are stochastically stable. 

Proof. The proof follows directly from Theorem 3.4. □
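
To illustrate Theorem 4.5 on a toy instance, the perturbed chain can be built explicitly and its stationary distribution computed for decreasing $\varepsilon$. The Python sketch below is not the report's construction: it assumes coalitions are selected uniformly, $f(S, g) = 1$, uniform choices within $\mathcal{I}_2$ and $\bar{\mathcal{I}}_2$, and the hypothetical payoff `degree_payoff` (each link is worth one unit). The exact transition probabilities of the report are not reproduced here.

```python
from itertools import combinations


def all_networks(nodes):
    links = [frozenset(p) for p in combinations(sorted(nodes), 2)]
    return [frozenset(c) for r in range(len(links) + 1)
            for c in combinations(links, r)]


def coalitions(nodes):
    nodes = sorted(nodes)
    return [frozenset(c) for r in range(1, len(nodes) + 1)
            for c in combinations(nodes, r)]


def degree_payoff(i, g):
    """Hypothetical payoff: the number of links of player i."""
    return sum(1 for e in g if i in e)


def split_moves(g, S, nets, payoff):
    """Split G(S, g) into improving networks and erroneous ones."""
    good, bad = [], []
    for h in nets:
        if not (all(e <= S for e in h - g) and all(e & S for e in g - h)):
            continue  # h is not obtainable from g by S
        gains = [payoff(i, h) - payoff(i, g) for i in S]
        if min(gains) >= 0 and max(gains) > 0:
            good.append(h)
        else:
            bad.append(h)  # always contains g itself
    return good, bad


def transition_matrix(nodes, payoff, eps):
    """Perturbed chain under the uniform assumptions stated above."""
    nets, coals = all_networks(nodes), coalitions(nodes)
    index = {g: k for k, g in enumerate(nets)}
    P = [[0.0] * len(nets) for _ in nets]
    p_S = 1.0 / len(coals)
    for g in nets:
        row = P[index[g]]
        for S in coals:
            good, bad = split_moves(g, S, nets, payoff)
            for h in bad:                      # error with probability eps
                row[index[h]] += p_S * eps / len(bad)
            if good:                           # otherwise improve if possible
                for h in good:
                    row[index[h]] += p_S * (1 - eps) / len(good)
            else:                              # or stay at g
                row[index[g]] += p_S * (1 - eps)
    return nets, P


def stationary(P, squarings=30):
    """Stationary distribution via repeated squaring of the matrix."""
    n = len(P)
    for _ in range(squarings):
        P = [[sum(P[i][k] * P[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    total = sum(P[0])
    return [x / total for x in P[0]]


nodes = {1, 2, 3}
complete = frozenset(frozenset(p) for p in combinations(sorted(nodes), 2))
for eps in (0.1, 0.01, 0.001):
    nets, P = transition_matrix(nodes, degree_payoff, eps)
    pi = stationary(P)
    print(eps, round(pi[nets.index(complete)], 4))
```

For this payoff the complete network is the unique strongly stable network, so the printed stationary mass on it should approach one as `eps` shrinks. Repeated squaring is used only because the state space is tiny; a linear solve would be the usual choice for larger instances.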


5 Conclusions 

We introduce coalition formation among players in an n-player strategic game over infinite horizon and
propose a CBR dynamics. Mutations are present in the dynamics due to erroneous decisions taken by
the coalitions. We prove that all strong Nash equilibria and closed cycles of action profiles are
stochastically stable, i.e., they are selected by the CBR dynamics as mutations vanish. A similar development
holds for strict strong Nash equilibria. We apply the CBR dynamics to network formation games and prove
that all strongly stable networks and closed cycles of networks are stochastically stable.


A Algorithm for computing strong Nash equilibrium

We give a finite step algorithm that computes an SNE whenever it exists. If an SNE does not exist, then
in a finite number of steps the algorithm confirms that there is no SNE. From the definition of SNE, an
action profile $a$ is an SNE if there is no improved action profile $a' \neq a$ for any coalition
$S \in \mathcal{S}$, i.e., $\mathcal{I}_1(S, a) = \emptyset$ for all $S \in \mathcal{S}$.


Algorithm 1

 1: Choose $a \in A$.
 2: Choose $S \in \mathcal{S}$.
 3: Choose $a'_S \in A_S$.
 4: if $u_i(a'_S, a_{-S}) > u_i(a)$, $\forall\, i \in S$ then
 5:     $A = A \setminus \{a\}$.
 6:     if $|A| = 0$ then
 7:         Go to Step 20
 8:     else
 9:         Go to Step 1
10: else
11:     $A_S = A_S \setminus \{a'_S\}$.
12:     if $|A_S| = 0$ then
13:         $\mathcal{S} = \mathcal{S} \setminus \{S\}$.
14:         if $|\mathcal{S}| = 0$ then
15:             Go to Step 21
16:         else
17:             Go to Step 2
18:     else
19:         Go to Step 3
20: Strong Nash equilibrium does not exist.
21: $a$ is a strong Nash equilibrium.


Algorithm 1 terminates in a finite number of steps because $A$ and $\mathcal{S}$ are finite. If we replace the
condition in Step 4 of Algorithm 1 by $u_i(a'_S, a_{-S}) \ge u_i(a)$, $\forall\, i \in S$, together with at least one
strict inequality, then Algorithm 1 computes an SSNE whenever it exists.
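
The exhaustive search performed by Algorithm 1 can be sketched in Python as nested loops rather than explicit jumps between steps. This is an illustrative sketch, not the authors' code: the function names, the `weak` flag that switches to the SSNE test, and the example games (the 3 × 3 coordination game of Section 3 and a hypothetical prisoner's dilemma) are assumptions made for the example.

```python
from itertools import combinations, product


def improves(a, S, a_S, payoff, weak):
    """Does the joint deviation a_S by coalition S improve upon profile a?"""
    b = list(a)
    for i, x in zip(S, a_S):
        b[i] = x
    b = tuple(b)
    gains = [payoff(i, b) - payoff(i, a) for i in S]
    if weak:  # SSNE test: weakly better for all, strictly for at least one
        return min(gains) >= 0 and max(gains) > 0
    return min(gains) > 0  # SNE test: strictly better for every member


def find_sne(action_sets, payoff, weak=False):
    """Return an SNE (an SSNE if weak=True), or None if none exists."""
    players = range(len(action_sets))
    coals = [S for r in range(1, len(action_sets) + 1)
             for S in combinations(players, r)]
    for a in product(*action_sets):                  # Step 1
        blocked = any(
            improves(a, S, a_S, payoff, weak)        # Step 4
            for S in coals                           # Step 2
            for a_S in product(*(action_sets[i] for i in S)))  # Step 3
        if not blocked:
            return a                                 # Step 21
    return None                                      # Step 20


# The 3 x 3 symmetric coordination game; A holds the row player's payoffs.
A = [[6, 0, 0], [5, 7, 5], [0, 5, 8]]


def coord_payoff(i, a):
    return A[a[0]][a[1]] if i == 0 else A[a[1]][a[0]]


# A hypothetical prisoner's dilemma (0 = cooperate, 1 = defect).
PD = {(0, 0): (3, 3), (0, 1): (0, 4), (1, 0): (4, 0), (1, 1): (1, 1)}


def pd_payoff(i, a):
    return PD[a][i]


print(find_sne([range(3), range(3)], coord_payoff))
print(find_sne([[0, 1], [0, 1]], pd_payoff))
```

The first call returns the index pair of $(s_3, s_3)$. The second returns `None`, because mutual defection is blocked by the grand coalition and every other profile is blocked by a single player. The running time grows exponentially with the number of players, as does that of Algorithm 1.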